DATE: Thursday, January 23, 2025
TIME: 12:00 pm - 1:00 pm
PLACE:
CIC 4th floor 4105 (Panther Hollow)
SPEAKER: Jeff Mogul, Google
TITLE: Thinking About Availability in Large Service Infrastructures, and Especially in Cloud Networks
ABSTRACT:
This talk covers several papers from HotOS 2017 and 2019, plus some additional material.
We increasingly depend on online services, whose availability depends in complex ways on that of a complex underlying set of invisible infrastructure services. Most software engineers lack useful frameworks to create and evaluate designs for individual services that support end-to-end availability in these infrastructures, especially given cost, performance, and other constraints on viable commercial services. Even given the extensive research literature on techniques for replicated state machines and other fault-tolerance mechanisms, we found little help in this literature for addressing infrastructure-wide availability.
The first part of the talk argues that, in many but not all ways, one can think about availability with the mindset that we have learned to use for security, and discusses some general techniques that appear useful for implementing and operating high-availability infrastructures.
The second part does a deeper dive into the specific problem of defining SLOs for cloud networks.
The third part revisits the question of SLO definition, and suggests that the problem needs to be reframed for cloud-computing providers, to incorporate some "statistical thinking" and expectations about customer behavior, not just about provider behavior.
BIO:
Jeff Mogul works on fast, cheap, reliable, and flexible infrastructure for Google. Until 2013, he was a Fellow at HP Labs, doing research primarily on computer networks and operating systems issues for enterprise and cloud computer systems; previously, he worked at the DEC/Compaq Western Research Lab. He received his PhD from Stanford in 1986, an MS from Stanford in 1980, and an SB from MIT in 1979. He is an ACM Fellow. Jeff is the author or co-author of several Internet Standards; he contributed extensively to the HTTP/1.1 specification. He was an associate editor of Internetworking: Research and Experience, and has been the chair or co-chair of a variety of conferences and workshops, including SIGCOMM, OSDI, NSDI, USENIX, HotOS, and ANCS.
VISITOR HOST: Justine Sherry
VISITOR COORDINATOR: Emily Spencer
SDI SEMINAR QUESTIONS?
Karen Lindenfelser, 86716, or visit www.pdl.cmu.edu/SDI/